-
Notifications
You must be signed in to change notification settings - Fork 43
[930][evaluation] implement CSVReader #932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iluise I have not tried it but I trust you did. Approved, and here are a couple of comments (small)
| self.csv_path = eval_cfg.get("csv_path") | ||
| assert self.csv_path is not None, "CSV path must be provided in the config." | ||
|
|
||
| self.data = pd.read_csv(self.csv_path, index_col=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing I would do is cast all the values to np.float32 (or float). pandas tries to be very clever and would for example use int32 if the data allows. I am not sure if xarray can deal with that later.
grassesi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you have a object that can be an instance of different subclasses and you want different behavior depending on the subclass, you should use polymorphism to implement it: Make a method with the same function signature in both subclasses and implement the different behavior there. The advantage is that the caller does not have to have any knowledge of the differnces (As you do here).
|
Hi @grassesi thanks for the suggestion, it looks much better now. I've updated the code and tested it. Can you check if this is what you expected and eventually approve the PR? |
grassesi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thats exactly what I meant 👍
* first version of quaver reader * working version * add CSVReader * rebase to develop * add polimorphism * fix names * lint
* training progress unit realignment from epoch to mini_epoch * small naming fix of mini_epoch * ruffed * linted * Fix out of bounds in data_reader_obs (#1180) * fix out of bounds access * Adding comment * Removed debgu * Fixed to use forward function for forecast engine (#1188) * Fixed to use forward function for forecast engine, and also fstep for conditioning * Fixed missing return statement * Enable FesomDataReader to have different source and target datasets (#1046) * Implement separate target and source files, adjust masking * Fix casual masking * Fix longitude conversion flag * Fix casual masking strategy --------- Co-authored-by: Seb Hickman <[email protected]> * Add support for constant learning rate (#1186) * Added support for constant learning rate and minor clean-up in code * Fixed issues with overlap between lr phases * Changing default lr to constant * [issue 1123] restore probabilistic scores (#1128) * rebase * add ensemble * fix deterministic * fix plotting * lint * fix eval_config * probabilistic scores working now * lint * Fix spoofing and refactor handling of multiple source files (#1118) * Cleaning up spoofing and related code on data preprocessing for model * Fixed typo * Updated comments * Removed merge cells and implemented necessary adjustments * Fixed forecasting * Fixed missing handling of NaNs in coordinates and channel data * Minor clean up * Fix to removing/renaming variables * Changed funtion name to improve readability * Fixed bug with incorrect handling of multiple input datasources. * Addressed reviewer comments * resolve conflict * [1131] fixes circular dependencies (#1134) * fixes dependencies * cleanup * make the type checker not fail * cleanup * cleanup of type issues * Give option to plot only prediction maps (#1139) * add plot_preds_only feature * minor changes after comments * Tell FSDP2 about embedding engine forward functions (#1133) * Tell FSDP2 about embedding engine forward functions Note DO NOT add print functions in forward functions of the model, it will break with FSDP2 * Add comment * recover 'all' option (#1146) * Fixed problem in inferecne (#1145) * implement vrmse (#1147) * [1144] Extra fixes (#1148) * Fixed problem in inferecne * more fixes * fixes * lint * lint --------- Co-authored-by: Christian Lessig <[email protected]> * Jk/log grad norms/log grad norms (#1068) * Log gradient norms * Prototype for recording grad norms * Address review changes + hide behind feature flag * Final fixes including backward compatibility * Ruff * More ruff stuff * forecast config with small decoder * fixed uv.lock * test gradient logging on mutli gpus * update uv.lock to latest develop version * revert to default confit * add comment on FSDP2 specifics * move plot grad script to private repo * rm seaborn from pyproject * updating terminal and metrics loggin, add get_tensor_item fct * check for DTensor instead of world size * revert forecast fct, fix in separate PR * rename grad_norm log names to exclude from MLFlow * add log_grad_norms to default config --------- Co-authored-by: sophiex <[email protected]> * Add forecast and observation activity (#1126) * Add calculation methods for forecast and observation activity metrics in Scores class * Add new calculation methods for forecast activity metrics in Scores class * ruff * fix func name * Rename observation activity calculation method to target activity in Scores class * typo * refactor to common calc_act function for activity * fix cases * have calc_tact and calc_fact that use _calc_act for maintainability * fix small thing in style --------- Co-authored-by: iluise <[email protected]> * hotfix: use correct methot `create` instead of `construct` (#1090) * restore develop * fix deterministic * fix plotting * lint * fix eval_config * probabilistic scores working now * lint * update utils * packages/evaluate/src/weathergen/evaluate/score.py * lint * removing duplication --------- Co-authored-by: Christian Lessig <[email protected]> Co-authored-by: Timothy Hunter <[email protected]> Co-authored-by: Savvas Melidonis <[email protected]> Co-authored-by: Sophie X <[email protected]> Co-authored-by: Julius Polz <[email protected]> Co-authored-by: Julian Kuehnert <[email protected]> Co-authored-by: Simon Grasse <[email protected]> * Adding config to issue templates The issue template seems to have disappeared, attempting to solve that. * Add the duration of animation as global plotting option (#1189) * Add the animation duration as global plotting option * Linting * Use FPS instead of milliseconds * Linting * Attempt to fix the bug report template * Attempt to fix initiative template * Update task template * [1081][Evaluation] Use parent ruff rules (#1177) * use ruff settings from parent * fix code checks * check fixes 2nd round * reformat to line length * [1092] Adds pushing metrics to the evaluation pipeline (#1127) * changes * changes * changes * changes * changes * scores successfully pushed to MLFlow, still need to refactor * try to batch upload all metrics form same runid * batch logging all scores of each run_id * get parent_run by from_run_id * changes * cleanups * bug fixes * typing issue * Cleanup * pdb * integration test --------- Co-authored-by: Jubeku <[email protected]> * Fix the issue - "Empty source still have embedding network" (#1114) * Replace cf.rank==0 with utils.distributed.is_root * fix empty source inputs still have embedding layer * fix lint * fix source empty or source exclude all * fix source empty or source exclude all * fix forecast mode empty source --------- Co-authored-by: wang85 <[email protected]> Co-authored-by: wang85 <[email protected]> Co-authored-by: wang85 <[email protected]> Co-authored-by: wang85 <[email protected]> * [930][evaluation] implement CSVReader (#932) * first version of quaver reader * working version * add CSVReader * rebase to develop * add polimorphism * fix names * lint * Iluise/hot fixes (#1209) * fix froct * fix 1150 * Fix plot_train verbosity (#1225) * [1206] Experimentation for extra data readers (#1207) * initial implementation * changes * toml * add module to annotations.json (#1142) Co-authored-by: Javad Kasravi <[email protected]> * Correct bug with score cards and bar plots for different metrics (#1192) * Rebase to develop * Linting * Address comments and linting * [eval][1122] Plot scores on a map (#1176) * first version of score maps * add maps to compute_scores * fix single sample situation * fix single sample * lint * restore score.py * fix bug in metric stream * default flag to false * Minor correction, a line was deleted by mistake? (#1193) * fix * working setup for regridded data * fix missing valid time case * lint and fix color in score cards * fix path for score maps * Allow plotting score maps every time --------- Co-authored-by: Savvas Melidonis <[email protected]> * Fix DDP without FSDP (#1227) * Fix DDP without FSDP * Fixed taht freezing would not have worked with only DDP * refactor export scripts - Part 1 (#1223) * move stuff around * move stuff around * rename files * [1034][reader_extra] E-Obs datareader (#1228) * [1034] rebase * [1034] add dataloader * [1034] Zarr3-->Zarr2 * [1034] lint * [1034] lint * [1034] Moved to reader_extra * [1034] registry E-Obs * training progress unit realignment from epoch to mini_epoch * ruffed * check if path is dir in io_reader * fix overwrite of fname_zarr in io_reader * add backward compatibility to config read * Separate write and read functions for model*chkpt*.json files (MatKBauer) --------- Co-authored-by: Christian Lessig <[email protected]> Co-authored-by: Kacper Nowak <[email protected]> Co-authored-by: Seb Hickman <[email protected]> Co-authored-by: iluise <[email protected]> Co-authored-by: Timothy Hunter <[email protected]> Co-authored-by: Savvas Melidonis <[email protected]> Co-authored-by: Sophie X <[email protected]> Co-authored-by: Julius Polz <[email protected]> Co-authored-by: Julian Kuehnert <[email protected]> Co-authored-by: Simon Grasse <[email protected]> Co-authored-by: Michael Tarnawa <[email protected]> Co-authored-by: Jubeku <[email protected]> Co-authored-by: Jifeng Wang <[email protected]> Co-authored-by: wang85 <[email protected]> Co-authored-by: wang85 <[email protected]> Co-authored-by: wang85 <[email protected]> Co-authored-by: wang85 <[email protected]> Co-authored-by: Javad kasravi <[email protected]> Co-authored-by: Javad Kasravi <[email protected]> Co-authored-by: Simone Norberti <[email protected]>
Description
Add reader to retrieve CSV scores from files generated with Quaver. @Jubeku
quaver scores can be plotted in the FastEvaluation package by adding this in the config (you need to have them locally first):
Issue Number
Closes #930
Is this PR a draft? Mark it as draft.
Checklist before asking for review
./scripts/actions.sh lint./scripts/actions.sh unit-test./scripts/actions.sh integration-testlaunch-slurm.py --time 60